Skip to content

feat: migrate from AST to import + inspect#265

Open
KAJdev wants to merge 5 commits intomainfrom
zeke/ae-2438-migrate-from-ast-walking-to-import-parse-objects-in-memory
Open

feat: migrate from AST to import + inspect#265
KAJdev wants to merge 5 commits intomainfrom
zeke/ae-2438-migrate-from-ast-walking-to-import-parse-objects-in-memory

Conversation

@KAJdev
Copy link
Contributor

@KAJdev KAJdev commented Mar 11, 2026

Replace AST-based resource discovery with runtime import scanning

the old scanner (RemoteDecoratorScanner, ResourceDiscovery) used Python's AST module to statically parse source files and find @remote decorated functions and resource configurations. This worked for simple cases but broke on any dynamic Python pattern (computed decorators, conditional definitions, programmatic resource construction, class inheritance, etc.)

this approach imports each user module via importlib and inspects live objects. The @remote decorator and Endpoint class stamp a __remote_config__ dict on decorated functions, so the scanner just walks module members looking for that attribute. this means anything that runs at import time is discovered correctly, regardless of how it was constructed.

How it works

RuntimeScanner in scanner.py:

  1. Collects .py files in the project (respecting .flashignore)
  2. Imports each file as a module via importlib.util
  3. Walks module members for __remote_config__ attributes
  4. Extracts RemoteFunctionMetadata from the live objects (resource config, function signature, dependencies, paths)
  5. Runs a lightweight AST pass on function bodies only for cross-call analysis (detecting when one @remote function calls another)

Import failures are collected and surfaced clearly in both flash run and flash deploy. Syntax errors, missing dependencies, and validation errors (e.g. forgetting name= on a load-balanced endpoint) show the filename and error message. The build stops immediately on fatal errors instead of silently continuing with zero discovered functions.

Fixes AE-2321

Other fixes bundled in

  • idleTimeout missing from deploy pipeline: _extract_config_properties() in manifest.py and create_resource_from_manifest() in resource_provisioner.py were not passing idleTimeout through, so all endpoints got the default 60s regardless of what the user specified.
  • _wrapped_class on RemoteClassWrapper: Added a class-level attribute so the scanner can recover the original class name and method definitions from wrapped classes.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR migrates Flash’s project scanning and resource discovery from AST-based parsing to runtime importing + object inspection, enabling detection of dynamically constructed endpoints/resources while also fixing missing idleTimeout propagation.

Changes:

  • Replace RemoteDecoratorScanner/ResourceDiscovery patterns with RuntimeScanner-driven import+inspect scanning across build/run flows.
  • Update Endpoint to support nameless QB decorator mode (derive name from decorated target) while requiring name/id only for image= client mode and enforcing name for LB route registration.
  • Propagate idleTimeout through manifest generation and runtime provisioning; update/trim tests accordingly.

Reviewed changes

Copilot reviewed 24 out of 24 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
tests/unit/test_p2_remaining_gaps_2.py Updates scanner references from RemoteDecoratorScanner to RuntimeScanner for empty-project behavior.
tests/unit/test_p2_remaining_gaps.py Updates nested-class/conditional scanning tests to use RuntimeScanner.
tests/unit/test_p2_gaps.py Updates docstring extraction test to use RuntimeScanner.
tests/unit/test_endpoint.py Adjusts validation expectations and adds coverage for nameless QB decorator mode.
tests/unit/test_discovery_endpoint.py Removes ResourceDiscovery endpoint pattern tests (scanner/discovery migration).
tests/unit/test_discovery.py Removes ResourceDiscovery unit tests (scanner/discovery migration).
tests/unit/cli/test_run.py Updates mocks for _scan_project_workers new return shape.
tests/unit/cli/commands/test_run_endpoint.py Updates _scan_project_workers call sites for new return shape.
tests/unit/cli/commands/test_run.py Updates _scan_project_workers call sites for new return shape.
tests/unit/cli/commands/test_build.py Updates test fixtures to construct valid Live* resources under runtime import scanning (adds name=).
tests/unit/cli/commands/build_utils/test_scanner_load_balancer.py Removes AST-scanner FastAPI detection tests (scanner migration).
tests/unit/cli/commands/build_utils/test_scanner_endpoint.py Removes AST-scanner Endpoint pattern tests (scanner migration).
tests/unit/cli/commands/build_utils/test_scanner.py Removes AST-scanner test suite (scanner migration).
tests/unit/cli/commands/build_utils/test_path_utilities.py Updates scanner import to RuntimeScanner for LB/QB flags and init exclusion tests.
tests/integration/test_lb_remote_execution.py Updates integration assertions for RuntimeScanner and name normalization (-fb suffix).
tests/integration/test_build_pipeline.py Updates scanner import and expected resource naming (-fb suffix); minor docstring normalization.
src/runpod_flash/runtime/resource_provisioner.py Passes idleTimeout from manifest into deployment kwargs.
src/runpod_flash/execute_class.py Adds _wrapped_class to RemoteClassWrapper for scanner/introspection to recover original class metadata.
src/runpod_flash/endpoint.py Allows nameless QB decorator mode (derive name in __call__), tightens LB route validation, clarifies image-mode requirement.
src/runpod_flash/core/discovery.py Removes AST-based ResourceDiscovery implementation.
src/runpod_flash/cli/commands/run.py Switches runtime worker scan/discovery to import+inspect; surfaces scanner import errors; updates worker scanning API.
src/runpod_flash/cli/commands/build_utils/scanner.py Introduces RuntimeScanner import+inspect discovery with AST-only cross-call analysis pass.
src/runpod_flash/cli/commands/build_utils/manifest.py Ensures idleTimeout is included in manifest resource config extraction.
src/runpod_flash/cli/commands/build.py Uses RuntimeScanner in build; prints import failures and exits when none are discovered.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Copy link
Contributor

@runpod-Henrik runpod-Henrik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1. Architecture shift — sound motivation, correct implementation

AST parsing breaking on computed decorators, conditional definitions, and programmatic resource construction is a real limitation. The __remote_config__ stamp pattern is the right fix: @remote and Endpoint both attach a dict at decoration time, and the scanner just walks live module members looking for it. Dynamic patterns that defeated AST parsing work correctly under import inspection.

The idleTimeout fix (manifest.py + resource_provisioner.py) and _wrapped_class on RemoteClassWrapper are clean bundled fixes that correctly close gaps in the manifest pipeline and class introspection.

2. Bug: partial build failure silently produces incomplete manifest

# build.py
if scanner.import_errors:
    console.print("\n[red bold]Failed to load:[/red bold]")
    for filename, err in scanner.import_errors.items():
        console.print(f"  [red]{filename}[/red]: {err}")
    if not remote_functions:      # ← only aborts if ALL modules failed
        raise typer.Exit(1)

If a user has five worker files and two fail to import, the build continues with a three-worker manifest, prints the errors in red, exits 0, and deploys. The user has a silently incomplete app.

The if not remote_functions guard is too permissive. At minimum, import failures should print a warning that the build is incomplete and require --allow-partial or similar to proceed. A hard stop on any import error is the safer default.

3. Issue: no timeout on module imports

_import_module_from_file() calls spec.loader.exec_module(module) with no timeout. Module-level blocking code — a database connection, time.sleep(), or an infinite loop — will hang flash build indefinitely with no output. The AST scanner was immune to this; runtime import is not. A per-file timeout (e.g., concurrent.futures with a 10s limit) would make failure explicit.

4. High: ~2,346 lines of scanner tests deleted, zero unit tests added

Three test files deleted:

  • test_scanner.py — 1,480 lines
  • test_scanner_endpoint.py — 600 lines
  • test_scanner_load_balancer.py — 266 lines

The PR adds no new unit tests for RuntimeScanner. test_build_pipeline.py has 15 line-for-line updates but no new scenarios. What's now untested:

  • _import_module_from_file() error paths (syntax error, missing dep, import exception)
  • _find_remote_decorated() / _find_endpoint_instances() introspection edge cases
  • _analyze_cross_calls_ast() with files that failed to import
  • Cross-call detection partial-failure degradation (item 2 above)
  • Module cleanup in finally block
  • __flash_local__ flag propagation through _metadata_from_remote_config()
  • LB route stamp correctness (method / path from @ep.get(...))

The old tests covered these exhaustively. The new implementation is simpler, but no behavioral guarantees are locked in.

5. Testing — what remains

Integration tests in test_build_pipeline.py and test_lb_remote_execution.py are updated and cover the happy path. The test name change ("test-gpu""test-gpu-fb") correctly reflects that runtime scanning reads the actual resource name (with suffix) rather than the AST string literal — that's a latent correctness improvement.

Verdict

The architectural direction is right and the implementation looks correct for the cases it targets. Two things should be addressed before merge: the partial-build silent-success behavior (item 2 — real user impact) and the unit test gap (item 4 — behavioral guarantees are unverified for error paths). The import timeout (item 3) is lower priority but worth a follow-up.


if not isinstance(decorator.func, ast.Attribute):
continue
remote_objects = _find_remote_decorated(module)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is outside the scope of this PR, but I this feels like just a couple steps away from resolving things outside of decorated functions and classes and shipping them

like if we are primarily working at the module level, could we just ship the whole module over? and that way we could not have to worry about imports inside of functions

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants